To: Ron Gruner

From: Steve Wallach

Topic: FHP Assessment



1 Introduction 

This memo presents the conclusions and recommendations based
on two trips to the RTP facility. Certain assumptions are made
about the near and long term goals of FHP. Where appropriate these
assumptions will be stated. For purposes of rationally presenting
my thoughts, the following sections are developed: Name_Space,
UID/AON, re-microprogramming, KOS (OS Kernel), microFHP, and
conclusion.



2 Name_Space

There are two primary forces driving the redesign of
Name_Space: performance of SPL programs and code density, the
latter with respect to VAX comparisons.

Taking the latter first. Rather than attempting to explain
why VAX has better code density, or the validity of the small
sample, it is worthwhile noting that with the present definition of
Name_Space some very interesting side effects result. First of
all, names must be 16 bits in length. Why?



1) A Name in Name_Space is not really a user defined name or
variable. Consider the existence of the array variable A
and the integer scalars I and J. References to A, I, J,
A[I], and A[J] require the compiler to use 5 names and not 3.
Thus there is a multiplicative effect that generally
results when array references occur. Of course, array
references use the longest NTE (128 bits). The present
effort to produce new 32 and 64 bit NTE's will solve a
major part of this problem.

2) All the names in independently compiled subroutines, when
bound into a procedure object, must be unique in that
object. That is, there cannot be multiple uses of the same
name. In reality, this is somewhat of a lie. This bind
strategy must be adopted due to the extensive overhead of
entering a subroutine using the procedure object environment
data in the object's root. If this latter strategy were
adopted, an 8 bit name would suffice for many subroutines,
but of course subroutine call overhead becomes excessive and
there goes performance.
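
The multiplicative effect described in point 1 can be sketched in a
few lines. This is only an illustration: it assumes that every
distinct operand form the compiler emits (a scalar, an array base, or
an array element indexed by a particular scalar) consumes one name,
and the function and variable names are hypothetical.

```python
def names_required(references):
    """Count distinct operand forms; each is assumed to need its own name."""
    return len(set(references))

# Operand forms referenced in the example: A, I, J, A[I], A[J].
# A repeated form (the second A[I]) reuses the name already allocated.
refs = ["A", "I", "J", "A[I]", "A[J]", "A[I]"]
user_variables = {"A", "I", "J"}

declared = len(user_variables)       # variables the programmer declared
allocated = names_required(refs)     # names the compiler must allocate
print(declared, allocated)
```

Under these assumptions three declared variables already need five
names, which is why name counts grow faster than variable counts.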



Conclusion: proceed with the redefinition and obtain
better code density than VAX (for whatever that's worth).

These issues I consider more academic in nature when compared
to some other issues that result in more substantive metrics
or perceived marketing advantages. Let's enumerate some of these
points.

1) Some of the SPL benchmarks run indicate a severe
performance penalty as contrasted to the MV/8000. Since in a
"TYPICAL" system it is not unusual to spend 50% or more in
the system, this must be corrected. The correction is
identification of the common addressing modes used in NTE's
and accelerating them. However, what thought has been given
to NTE reference patterns for languages other than SPL and
Fortran? Presumably COBOL performance will become an issue
some day. Does COBOL have an address mode pattern
sufficiently similar to that of SPL and Fortran? If so,
the present effort also accelerates COBOL. If it does not,
this should be considered. I do not know the answer to
this question, but someone should provide one. Pascal
should be considered as part of this effort.

2) An opinion already publicly discussed is that the extent to
which your architecture is superior to a competitor's is a
function of the user perceived benefits. The notion of
dial-a-precision integers and floats in Fortran was previously
noted. Less obvious, but mentioned by some scientific and
technology bigots, is the notion of mixed mode
arithmetic. This was discussed briefly with some people.
From a performance and code generator viewpoint, any
advantages are minimal. However, in the never ending
search for all the hype and impact, direct support of mixed
mode can be a product differentiator. The basic Name_Space
structure effectively mimics a data tagged architecture. By
directly supporting data types in an NTE and thus having a
generic ADD and not ADD Integer and ADD Float, some product
differentiation can be obtained (the System/38 supports
this type of "GENERIC" instruction structure). This
feature will turn on certain customers. The most frequently
mentioned drawbacks to this feature are the cost of
implementation and what does it buy, since mixed mode
arithmetics do not occur frequently.



I believe that the latter objection has been answered. The
first objection is simply a reaction to a change. From
first hand experience on the Burroughs 6700 class of
machines and the Raytheon Data Tagged AADC (I mention
this only to prove the point), data tagging and mixed mode
can be supported with NO loss of performance for the
general case of constant arithmetics.
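
A generic ADD driven by a type tag in the NTE might be sketched as
follows. This is a minimal illustration, not the FHP design: the NTE
class, its two fields, and the tag values "int" and "float" are
assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class NTE:
    """Hypothetical name table entry carrying a data-type tag."""
    dtype: str      # assumed tag values: "int" or "float"
    value: object

def generic_add(a: NTE, b: NTE) -> NTE:
    """One ADD opcode; the operand tags select the arithmetic."""
    if a.dtype == b.dtype:
        # Same-type case: no conversion work is needed.
        return NTE(a.dtype, a.value + b.value)
    # Mixed mode: widen the integer operand to float before adding.
    return NTE("float", float(a.value) + float(b.value))

same = generic_add(NTE("int", 2), NTE("int", 3))
mixed = generic_add(NTE("int", 2), NTE("float", 0.5))
print(same, mixed)
```

The same-type path does no extra work beyond comparing the tags,
which is the case the performance argument above rests on.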

3) The entire issue of S-languages, binding subroutines of
different languages together, and the development of support
for newer high level languages as they come along.



Examining the latter two points and reaching conclusions
based on the presented facts reveals the following. Though
not quite clear from the documentation (if it is not true,
it could be made to work), multiple Procedure Environment
Descriptors can be supported in the same procedure object.
Assume the following simplifications can be made: the
static data pointer remains unchanged, the name table
pointer remains unchanged (there is no reason that names
across S-interpreters cannot be supported), a common
subroutine call and return mechanism across all S-languages
exists (I believe this is the case), and the S-language
identity can be incorporated in the NTE used to name the
called subroutine. One obvious question is the mechanism
used to invoke the original S-language of the caller.
Again, the arch. document is unclear about the macrostate
stored in the frame pushed on the current stack. I am
assuming that a reasonably sized bit field (4-8) can be
placed on the stack and used to identify the S-language of
the caller. In effect what has been described is a flat
inter-language call.



If this is done and the S-language interpreter is present
(which it is on Sprint), the overhead of an S-language switch
is minimal (I believe 1 or 2 microcycles at worst). More
will be said about speed versus architecture after the next
point.
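
The flat inter-language call described above can be sketched as a
frame that records the caller's S-language. This is an illustration
under stated assumptions: the Machine class, the frame layout, and
the numeric S-language ids are all hypothetical.

```python
S_LANGUAGES = {0: "SPL", 1: "FORTRAN", 2: "COBOL"}  # assumed id assignment

class Machine:
    def __init__(self, s_lang):
        self.s_lang = s_lang   # id of the currently active S-language
        self.stack = []        # frames pushed by call()

    def call(self, callee_s_lang, return_pc):
        # Save the caller's S-language id in the frame, then switch.
        self.stack.append({"return_pc": return_pc, "s_lang": self.s_lang})
        self.s_lang = callee_s_lang

    def ret(self):
        # Restore the caller's S-language from the popped frame.
        frame = self.stack.pop()
        self.s_lang = frame["s_lang"]
        return frame["return_pc"]

m = Machine(s_lang=1)           # running Fortran code
m.call(callee_s_lang=2, return_pc=100)
inside = S_LANGUAGES[m.s_lang]  # now interpreting Cobol S-ops
resume_at = m.ret()
after = S_LANGUAGES[m.s_lang]   # back to Fortran
print(inside, resume_at, after)
```

The only extra work on call and return is storing and reloading one
small field, which is consistent with a switch cost of a microcycle
or two when the interpreter is already resident.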



Are multiple S-languages a boon or bane? Borrowing from a
past leader, it is neither; they are a canard. I believe
the following is true about S-languages, especially with
respect to FHP. The intersection of the Fortran, SPL, and
Cobol S-languages, the instructions in common (forgetting
opcode assignment), and the disjoint set correlates with
the Eclipse M/600 or the MV/8000. The word addressed
floating point is certainly the Eclipse Fortran S-language,
the byte granular Commercial set is the Cobol S-language
set, and the character instructions and the various
privileged instructions are the SPL set. In reality all
S-language design has accomplished is to permit a degree of
freedom to the compiler designers in determining the
desired object code to be generated. Unlike the B-1700, where
the compiler writer in addition to instruction semantics
could choose descriptor format (read NTE format), in FHP
only instruction semantics are permitted. Already existing
in the structure of the NTE is the superset notion of all
the addressing and naming conventions required for the
anticipated high level languages (and not augmentable by an
S-language).



What does this mean? Other than eliminating multiple
opcode decoders in the IP (so I'm told), very little else
would be gained in the sum total of all microcode developed
for Sprint and, I suspect, future FHP's. The same
functionality will always exist. Of course the real
downside (as you correctly perceived) is that software
development may want an S-language for each additional
compiler supported. This could then potentially translate
into the massive microprogramming effort so feared.
Classically, additional languages required additional
run-time support, and the run-time support was transportable
from one machine to another. I think this issue is
more a question of management control. By simply dictating
that until further notice there are no additional S-languages,
and that all future compilers must choose one of the available
languages, you still maintain the user perception of a
benefit of the architecture and you leave open the
opportunity for augmentation of the architecture in an orderly
way in the future. This approach only makes sense if one of
the available S-languages can be used for the more immediate
compiler (i.e., Pascal, PL/1, C, RPG, ?Basic, ?APL)
development efforts.



What this means is that all the technical and performance
objections to S-languages can be solved with the same level
of thought as is now going into Name_Space or recoding the
Kernel microcode. All other objections can be handled with
a firm management commitment.



One last observation: based on the S-ops listed in the FHP
arch. document dated Nov/5/79, the intersection of the
S-languages supports my previous analogy with the MV/8000.
There are approximately 75 SPL S-ops. The intersection of
SPL and Fortran results in just 30 additional S-ops being
defined. As you can imagine, these 30 deal with the
floating point and character string data types of Fortran.
Again intersecting this set with Cobol results in 48
additional S-ops. You guessed it, the additional S-ops
deal with decimal data types and the editing and searching
semantics of Cobol. In total 191 unique S-ops exist.
What does this analysis mean? One obvious conclusion is
to combine all S-ops into one instruction set. This
eliminates all context switches and program bind problems
among different S-op modules. What we see is a scientific
and commercial instruction set built on a base (or SPL)
instruction set. Given the 8 bit S-op encoding, sufficient
space is left for expansion. Also, all three base languages
have the complete instruction set available. Thus, Fortran
COULD have some of the commercial capabilities of Cobol (I
am not advocating this, but only noting what can be done. Of
course PL/1 DOES have both commercial, character, and
scientific data types). This should be seriously
considered.
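
The headroom claim for a single combined instruction set is simple
arithmetic, sketched here. The only inputs are the 8 bit S-op
encoding and the total of 191 unique S-ops quoted above; the
variable names are illustrative.

```python
OPCODE_BITS = 8      # S-op encoding width stated in the memo
TOTAL_S_OPS = 191    # total unique S-ops quoted in the memo

encodings = 2 ** OPCODE_BITS          # distinct opcodes available
free = encodings - TOTAL_S_OPS        # opcodes left for expansion
print(encodings, free)                # 256 65
```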



The next step is to then realize that most of the
instruction set has the same semantics applied to different data
types. This naturally leads to binding the data type in
the NTE. Other than the reasons previously given for this
abstraction, some secondary benefits are: for languages
with run-time coercion of data types (like APL), a natural
way exists to support such an interpreter; and lastly, a
convenient way is now defined to MATCH the data types of
input actual arguments against the data types expected. The
Fortran standard says passing the wrong argument type is an
error. Undefined results occur. There has been many a
paper that mentions this error as an area that should be
given aid by the compiler. Software reliability is a big
selling feature. In this case, the data types (optionally)
of the passed arguments are matched against a template at
the callee's site.
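
Matching argument types against a callee template could look like
the sketch below. It assumes each actual argument carries a type tag
taken from its NTE and that the callee publishes a list of expected
tags; the function name and the tag strings are hypothetical.

```python
def check_call(template, actual_types):
    """Return the positions whose actual tag differs from the callee template."""
    if len(template) != len(actual_types):
        raise TypeError("argument count mismatch")
    return [i for i, (want, got) in enumerate(zip(template, actual_types))
            if want != got]

# Callee expects (float, int); the second call passes an int where a
# float is expected, the error the Fortran standard leaves undefined.
ok = check_call(["float", "int"], ["float", "int"])
bad = check_call(["float", "int"], ["int", "int"])
print(ok, bad)
```

Because the check is a comparison of small tags, it could be made
optional per call, as the text suggests.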
4) Presently intra-object calls that stay within the current
domain are expensive. This expense results in multiple
copies of the run-time environment existing in memory. A
copy is bound to every procedure object and not shared.
Though the architecture supports shared procedure objects,
the call time and name cache fill/flush situation reduces
performance. It seems inconceivable that in an
architecture with its object addressing the run-times
are not shared as a matter of policy.



There are two suggestions in this area. First, provide a FLAT
intra-object procedure call and return, or sufficiently
expedite the architectural CALL for this case. Secondly,
redesign the name cache such that its association is on
AON||Name, not just Name. This eliminates the need for name
cache flush/fill on a return.
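
The effect of the second suggestion can be shown with a toy cache.
This is a sketch, not the Sprint name cache: the class, the resolve
callback, and the use of small integers for AONs and names are
assumptions made for illustration.

```python
class NameCache:
    """Toy name cache keyed on the name alone or on the (AON, name) pair."""
    def __init__(self, key_on_aon):
        self.key_on_aon = key_on_aon
        self.entries = {}

    def switch_object(self):
        # Control moves to another procedure object (a call or a return).
        if not self.key_on_aon:
            # Name-only keys are ambiguous across objects, so flush.
            self.entries.clear()

    def lookup(self, aon, name, resolve):
        key = (aon, name) if self.key_on_aon else name
        if key not in self.entries:
            self.entries[key] = resolve(aon, name)   # cache fill
        return self.entries[key]

def count_fills(cache):
    fills = []
    def resolve(aon, name):
        fills.append((aon, name))
        return f"{aon}:{name}"
    cache.lookup(1, 5, resolve)     # caller resolves name 5
    cache.switch_object()           # call into object 2
    cache.lookup(2, 5, resolve)     # callee uses the same name number
    cache.switch_object()           # return to object 1
    last = cache.lookup(1, 5, resolve)
    return len(fills), last

print(count_fills(NameCache(key_on_aon=False)))
print(count_fills(NameCache(key_on_aon=True)))
```

With the pair as the key, the caller's entry survives the call, so
the lookup after the return is a hit rather than a refill.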



3 UID/AON and machine state issues

Many of the performance issues, especially during cross domain
call and fault processing, stem from the conversions between
AON/UID and UID/AON, and the potential for an excessive amount of
machine state saving and restoring.

The AON/UID issue is the more peculiar of the two. Promoted
as the vehicle to avoid ambiguous names and make software development
easier and more reliable (the ability to encapsulate, at will,
data or other things in named objects), it will probably achieve many
of these objectives. However, since the cost of hardware is not yet
free, the exact construct that software wants to eliminate has
become a burden on the hardware. AON's were created so that
software could easily index into sparse (relative to the length of
a UID) tables, and also to eliminate the burden on the hardware to
maintain 80 bits of object ID. The goal is noble, but the cure may
be worse than the cause. Extensive time is spent converting AON to
persistent UID during a context swap. This is due to the AON not
being persistent. The ATU must be purged upon context swap, since it
associates on AON and not UID.

There are proposals to fix this problem, all of which involve
additional hardware accelerators for UID/AON conversions. Before
any consideration is given to applying hardware solutions, an
analysis must be made of the design of the AON/UID abstraction.
After all, this is the root of the problem. While I am not as yet
finished with my analysis (in reality in cooperation with Steve
Schleimer and Doug Wells), some simplifications are worth noting.
Among them are:



1) Contracting the UID from 80 bits to 64 bits.

2) If AON's remain, allocate the lower half (the first bit) to be
system maintained. Thus this AON need never be converted
to UID and the construction of a split ATU becomes
feasible. The system ATU need not be flushed on context
switches, since its AON associations are system-wide.
(Already implemented, conceptually or in reality, on the
Prime 780, VAX, and the MV/8000.)

3) Investigate the possibility of defining for FHP_1 (using
the nomenclature in one of your memos) a 128 bit pointer for
which only 64 bits are "ACTIVELY" interpreted. Actively is
presently undefined. Or simply, for FHP_1, only supporting
a 64 bit pointer, but allocating 128 bits in main memory.
Thus, when it becomes judicious, all 128 bits of the pointer
have meaning. It is my understanding that in the first
releases of the OS, UID's (for the user) are not supported.
The System/38 employs a similar approach. Their
architecture provides for a 40 bit segment number, only 24 of which
are supported in the first incarnation. [illegible]

4) Since only four domains are supported, incorporate the
protection access bits with physical address generation.
Thus the ATU serves two purposes (protection validity and
physical address generation) and the protection cache is
eliminated.
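
The split ATU of simplification 2 can be sketched as two tables
selected by the AON range. Everything concrete here is an
assumption for illustration: the 16 bit AON width, the class and
method names, and the string stand-ins for translations.

```python
AON_BITS = 16                         # assumed AON width, for illustration only
SYSTEM_LIMIT = 1 << (AON_BITS - 1)    # lower half reserved for the system

class SplitATU:
    def __init__(self):
        self.system = {}   # AON -> translation, valid system-wide
        self.user = {}     # AON -> translation, valid for one process

    def _half(self, aon):
        return self.system if aon < SYSTEM_LIMIT else self.user

    def insert(self, aon, translation):
        self._half(aon)[aon] = translation

    def lookup(self, aon):
        return self._half(aon).get(aon)

    def context_switch(self):
        # Only the per-process half is purged; system entries stay valid.
        self.user.clear()

atu = SplitATU()
atu.insert(3, "kernel page")
atu.insert(SYSTEM_LIMIT + 3, "user page")
atu.context_switch()
print(atu.lookup(3), atu.lookup(SYSTEM_LIMIT + 3))
```

After the switch the system entry is still a hit, while the user
entry must be refilled for the incoming process.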



Machine state issues involve the excessive machine state that
is either created across machine instructions or saved as a result
of page faults. The excessive machine state created comes about
due to the definition and implementation of the CALL instruction.
Needless to say this is not a surprise and an extensive effort is
underway to correct this situation.

The state save issue with respect to page faults may not be so
easy to fix. Experience with the MV/8000 indicates that in the
vast majority of cases only a short context block (only useable
state) need be saved. In effect most instructions are restartable.
This is not due to the implementation nor the architecture but is a
result of the fact that most instructions perform very simple
operations (the same is true in Sprint - examine the S-op
distribution for SPL programs). The response to why Sprint cannot do
this (other than the present design) generally involves responses
like: for worst case we have to be capable of saving all state; the
page fault handler is in microcode using additional frames on the
microstack, thereby creating state that must be saved; and the page
fault handler resides on its own unique virtual processor, thus a
process switch must be performed. Again the problem manifests
itself due to the high level design and not the implementation.

This design must be re-evaluated. There are too many cases
where the microcode is structured solely to mimic a high level
software construct without the appropriate reduction to a hardware
control mechanism. The advantages of this high level abstraction
are clear (they happen to be Huber's E.E. thesis at M.I.T.).
However, the same result can be achieved by a redesign that puts
some additional burden on the kernel software. This burden would be
relieved of the microcode and make it feasible to simply restart
instructions. Ultimately, state saving and restoring is more
efficient due to the elimination of additional microstate.



4 KOS - kernel

This discussion pertains to certain capabilities KOS does
not support that it should. AOS and AOS/VS permit three process
types to exist: swappable, pre-emptible, and resident. As near as I
can tell, KOS only permits swappable (or general purpose
interactive user). If the first product offering only contains Fortran
and/or Sprint is to sell into the real-time marketplace, some
additional capabilities over and above that which is presently
supported must be provided.

These capabilities must include: 1) the notion of a resident
process. KOS already permits this by virtue of the page fault
handler virtual processor. It seems appropriate that a user
visible virtual processor type of resident be provided.
Otherwise there is no guaranteed interrupt response time. 2) the
ability to wire and unwire pages of a resident process's working set.

Effectively, the resident process of AOS/VS does not mean that
the entire process is resident, only that the ?WIRE and ?UNWIRE
calls are supported. Pages that are not wired are faulted in and
out.



Additionally there appears to be no notion of multi-tasking
within KOS. The comment in reaction to this statement was that
multiple processes can be used. However, processes are expensive
to create, manage, and provide communications between as compared
to tasks. The question of multi-tasking on the MV/8000 and its
support under AOS/VS and Fortran 77 was a very common question
during customer presentations.

A question you can answer is to what extent does Sprint have
to be perceived as a high end offering of the current product
line? Expanding one step further, is Sprint's relationship with
ECLIPSE hardware and software the same as S/38's relationship [...]



5 Re-microprogramming

This is clearly identified as one of the critical redesigns necessary
for performance enhancement. The organization of the approach
taken by L. Schiller should get acceptable results. If one were to
read his workplan, the interesting notion of microcode generation
from IBM's PL/S is brought up. Independent of S-languages, the
amount of microcode needed to be developed for FHP, with its
extensive support for OS functions, would dictate that the feasibility
of generating microcode in this fashion at least be examined.



6 microFHP

Several suggestions were made relative to the block diagrams
presented, the major one being incorporating a smaller name cache
within the ALU chip. This eliminates the need for one of the chips
and reduces the risk of the project. Of course the performance
consequences of this are not quantified. Additional analysis could
not be done without a detailed determination of the chip area
required for all the listed functions and blocks. Mitchell
realizes that this is the next critical step in the project.



7 Conclusion

The above sections have enumerated suggestions for some
changes to the architecture/implementation of Sprint. It has been
assumed that the goal of Sprint (as you said at
your staff meeting that I was present at) is to make the computing
world and Data General say that this is worth waiting for and/or
this is the best thing since white bread.

The way the architecture contributes to this is directly
proportional to the perceived user benefits. Clearly performance
is one obvious user benefit. Whetstone at the 2000 level is very
good, though higher would be better with no incremental product
cost. Will the vectorizer do this? It would be highly desirable
to have this ready by announcement.

A stronger statement concerning compatibility would help.
Compatibility comes in many flavors. The MV/8000 chose binary;
Sprint has somewhat chosen high level language and AOS file level.
This should be sufficient with the right amount of marketing hype
and technical backup. Many present DG users, though impressed with
the binary compatibility of the MV/8000, would have been satisfied
with a recompile. In fact Cobol users must recompile. I don't
know what the situation is, but the Sprint Fortran 77 or some
mechanical translator should be able to compile Eclipse Fortran 5
to Sprint. Another desirable feature of Fortran should be the
capability to compile IBM's Fortran. Prime's Fortran does this,
and they get their fair share of business via this route.

In fact a generic approach of compiling IBM Fortran and Cobol
should enhance Sprint.

A fact brought up by many DG customers is the lack of a 32-bit
bus for the MV/8000. DEC uses their DR780 attachment to the SBI to
their advantage in many marketing situations.

Since Sprint is inherently a tightly coupled multi-processor,
provide the mechanisms that permit two JP's controlled by one IOP.
Perhaps this is the mid-life kicker so frequently mentioned.

Of course the real win would be to announce any type of
tandem, non-stop, or other ARM features. These features always
help sell. It is not clear, within the time constraints of Sprint
announcement goals, what can be accomplished. However, here are some
ideas that may help Sprint's ARM story.

1) Permit the memory and I/O controller boards to be
electrically disconnected from the host processor without powering
down the entire bay.

2) Permit a form of graceful degradation by microcoding the
E-BOX functionality into the fetch unit. Thus if any of
the 3 E-BOX boards fail, processing can continue at slower
performance.

3) Permit the processor to continue functioning with TBS
replaced with a 6053 or equivalent. One of the often
mentioned remarks concerning the MV/8000 is the sensitivity
of the machine to the floppy, M3C, and 6053. Customers
wanted to know if backup units could be made available.
While customers may have spare 6053's, I doubt a spare TBS
will be available.



Graceful degradation as a form of high availability is as
valid an approach as duplication (though not guaranteeing the same
level of availability). In a uni-processor that may be the best
one could hope for.